Multi-class Imbalanced Data-Sets with Linguistic Fuzzy Rule Based Classification Systems Based on Pairwise Learning

نویسندگان

  • Alberto Fernández
  • María José del Jesús
  • Francisco Herrera
چکیده

In a classification task, the imbalance class problem is present when the data-set has a very different distribution of examples among their classes. The main handicap of this type of problem is that standard learning algorithms consider a balanced training set and this supposes a bias towards the majority classes. In order to provide a correct identification of the different classes of the problem, we propose a methodology based on two steps: first we will use the one-vs-one binarization technique for decomposing the original data-set into binary classification problems. Then, whenever each one of these binary subproblems is imbalanced, we will apply an oversampling step, using the SMOTE algorithm, in order to rebalance the data before the pairwise learning process. For our experimental study we take as basis algorithm a linguistic Fuzzy Rule Based Classification System, and we aim to show not only the improvement in performance achieved with our methodology against the basic approach, but also to show the good synergy of the pairwise learning proposal with the selected oversampling technique.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

On Mining Fuzzy Classification Rules for Imbalanced Data

Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it on imbalanced data sets is its biased to the majority class, such that, it performs poorly in respect to the minority class. However many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

متن کامل

On Mining Fuzzy Classification Rules for Imbalanced Data

Fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it on imbalanced data sets is its biased to the majority class, such that, it performs poorly in respect to the minority class. However many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

متن کامل

A hierarchical genetic fuzzy system based on genetic programming for addressing classification with highly imbalanced and borderline data-sets

Lots of real world applications appear to be a matter of classification with imbalanced data-sets. This problem arises when the number of instances from one class is quite different to the number of instances from the other class. Traditionally, classification algorithms are unable to correctly deal with this issue as they are biased towards the majority class. Therefore, algorithms tend to mis...

متن کامل

Solving multi-class problems with linguistic fuzzy rule based classification systems based on pairwise learning and preference relations

This paper deals with multi-class classification for linguistic fuzzy rule based classification systems. The idea is to decompose the original data-set into binary classification problems using the pairwise learning approach (confronting all pair of classes), and to obtain an independent fuzzy system for each one of them. Along the inference process, each fuzzy rule based classification system ...

متن کامل

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010